List of AI News about micro language models
| Time | Details | 
|---|---|
| 2025-10-16 00:14 | **NanoChat d32: Affordable LLM Training Achieves 0.31 CORE Score, Surpassing GPT-2 Metrics.** According to Andrej Karpathy, the NanoChat d32 model, a depth-32 version trained for $1,000, completed training in roughly 33 hours and shows significant improvements on key AI benchmarks. It achieved a CORE score of 0.31, notably higher than GPT-2's 0.26, and its GSM8K performance jumped from around 8% to 20%. Metrics for pretraining, supervised fine-tuning (SFT), and reinforcement learning (RL) all showed marked increases (Source: Karpathy, Twitter; GitHub repo for NanoChat). Despite the model's low cost relative to frontier LLMs, Karpathy cautions that user expectations for micro-models should be tempered, as they are limited by their size and training budget. The business opportunity lies in the rapid prototyping and deployment of small LLMs for niche applications where cost and speed are prioritized over state-of-the-art performance. Karpathy has made the model and training scripts available for reproducibility, enabling AI startups and researchers to experiment with low-budget LLM training pipelines. |
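As a quick sanity check of the gains reported above, a minimal sketch that turns the cited figures into relative improvements (the numbers come from the article; the helper function is hypothetical, not part of the NanoChat codebase):

```python
# Relative improvements of NanoChat d32 over GPT-2, using the figures
# reported in the article (CORE 0.31 vs 0.26; GSM8K ~20% vs ~8%).
def relative_gain(new: float, old: float) -> float:
    """Fractional improvement of `new` over `old`."""
    return (new - old) / old

core_gain = relative_gain(0.31, 0.26)   # CORE score comparison
gsm8k_gain = relative_gain(0.20, 0.08)  # GSM8K accuracy comparison

print(f"CORE:  +{core_gain:.0%}")   # roughly +19%
print(f"GSM8K: +{gsm8k_gain:.0%}")  # roughly +150%
```

The GSM8K jump is the striking one: a 2.5x improvement on grade-school math for a $1,000 training budget, which is the kind of ratio that makes low-budget pipelines interesting for niche applications.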